Learning against opponents with bounded memory
نویسندگان
چکیده
Recently, a number of authors have proposed criteria for evaluating learning algorithms in multiagent systems. While well-justified, each of these has generally given little attention to one of the main challenges of a multi-agent setting: the capability of the other agents to adapt and learn as well. We propose extending existing criteria to apply to a class of adaptive opponents with bounded memory. We then show an algorithm that provably achieves an ǫ-best response against this richer class of opponents while simultaneously guaranteeing a minimum payoff against any opponent and performing well in self-play. This new algorithm also demonstrates strong performance in empirical tests against a variety of opponents in a wide range of environments.
منابع مشابه
Planning against fictitious players in repeated normal form games
Planning how to interact against bounded memory and unbounded memory learning opponents needs different treatment. Thus far, however, work in this area has shown how to design plans against bounded memory learning opponents, but no work has dealt with the unbounded memory case. This paper tackles this gap. In particular, we frame this as a planning problem using the framework of repeated matrix...
متن کاملLearning in games with more than two players
We address the problem of learning in repeated N-player (as opposed to 2-player) general-sum games. We describe an extension to existing criteria focusing explicitly on such settings. While there have been several criteria proposed recently for evaluating learning algorithms in multi-agent systems, most of this work has focused on the two-player setting. Relatively little work has addressed sit...
متن کاملLearning in the Presence of other Unknown Agents
The field of multiagent learning is concerned with the study of how agents can learn and adapt in the presence of other agents that are simultaneously learning and adapting. To this end, the main agenda of this dissertation will be to develop multiagent learning algorithms that learn to maximize their own payoff over time when interacting repeatedly with the same set of unknown agents. In parti...
متن کاملPerformance Bounded Reinforcement Learning in Strategic Interactions
Despite increasing deployment of agent technologies in several business and industry domains, user confidence in fully automated agent driven applications is noticeably lacking. The main reasons for such lack of trust in complete automation are scalability and non-existence of reasonable guarantees in the performance of selfadapting software. In this paper we address the latter issue in the con...
متن کاملUnifying Convergence and No-Regret in Multiagent Learning
We present a new multiagent learning algorithm, RVσ(t), that builds on an earlier version, ReDVaLeR . ReDVaLeR could guarantee (a) convergence to best response against stationary opponents and either (b) constant bounded regret against arbitrary opponents, or (c) convergence to Nash equilibrium policies in self-play. But it makes two strong assumptions: (1) that it can distinguish between self-...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005